Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change the model for ReadableStream to have async read() #296

Merged
merged 9 commits into from
Mar 17, 2015

Conversation

domenic
Copy link
Member

@domenic domenic commented Mar 12, 2015

This replaces the dual ready + read() approach previously, which was derived from the epoll(7) + read(2) paradigm. In #253 we did a deep dive on real-world implementation strategies for byte streams, and discovered that the ready + read() model causes a conflict with the semantics we want for byte streams. Briefly, because some byte streams will demand to know the size of the buffer they must fill before doing any I/O (the fread(3) model), the byte stream method that reads into a given typed array must be asynchronous. If such byte streams are then to conform to the readable stream interface, with a no-argument read() method that is an auto-allocating version of the read-into method, then read() must also be async, across all readable streams.

This is a usability upgrade for consumers, in many cases. However, for non-byte streams, it potentially costs more microtasks when multiple chunks of data would be available synchronously. We are hopeful that even if current JS engine microtask queues are not as fast as they could be, they will improve over time until this overhead is negligible (which in theory it should be).

Alongside this change, we moved the read() method from the readable stream to the reader (now called "readable stream reader" instead of "exclusive stream reader"). This drastically simplifies the stream/reader interaction, and also allows the possibility of different types of readers which have different reading behavior---again, very helpful for byte streams. The usability impact is also positive, as it guides consumers toward using piping instead of directly reading chunks from the stream.

Note that read() promises fulfill with { value, done } instead of using an EOS sentinel value. This avoids a number of problems (see extensive discussion in #253), and also provides a mechanism by which readable byte streams can smuggle out "unused" buffers given to them, by using { value: zeroLengthViewOntoBuffer, done: true }.

Finally, the state property is now removed (from readable stream), since there is no longer a "waiting" vs. "readable" distinction.

This commit also adds some new infrastructure for templated tests, and ports some portion of the existing tests there. This is our solution for #217 and #264.

Note that we re-merged all related code into a single readable-stream.js file, as the setup with the three separate files (readable-stream.js, exclusive-stream-reader.js, and readable-stream-abstract-ops.js) was problematic in causing circular dependencies.


We apologize for this large of a change, this late in the game! We really got bit by not considering the zero-copy byte stream use case carefully enough. We spent a lot of concentrated time working out how this change will enable byte streams, and @tyoshino is working on a prototype implementation over in #287 which will allow us to bring byte streams into the main spec/reference implementation/test suite soon. In the end, it's painful to see this large of a diff, but the result will be a lot better from many angles---not only byte stream compatibility, but general developer-facing ergonomics as well.


Branch snapshot at https://streams.spec.whatwg.org/branch-snapshots/async-read-take-3/. Sections that are good for at-a-glance understanding of the change, or sections that have seen particular rework, include:


if (this._ownerReadableStream === undefined) {
return Promise.resolve(undefined);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return a promise rejected with the error if this was released when it got errored?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, will update

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended last commit in the PR with fix

function CreateReadableStreamCloseFunction(stream) {
return () => {
if (stream._state !== 'readable') {
return undefined;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's out of the main topic of the PR, but don't we want to say "you're making a mistake" when close()-ed twice or more?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I think you are right. Underlying source should be more careful than this. Open a new issue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #298

@tyoshino
Copy link
Member

Took a look at readable-stream.js. Overall looks good.

@domenic
Copy link
Member Author

domenic commented Mar 16, 2015

Awesome. index.bs review would be good too (especially to the non-algorithm text). Tests might not be worth it.

@tyoshino
Copy link
Member

@domenic
Copy link
Member Author

domenic commented Mar 16, 2015

Yep!

This replaces the dual ready + read() approach previously, which was derived from the epoll(7) + read(2) paradigm. In #253 we did a deep dive on real-world implementation strategies for byte streams, and discovered that the ready + read() model causes a conflict with the semantics we want for byte streams. Briefly, because some byte streams will demand to know the size of the buffer they must fill before doing any I/O (the fread(3) model), the byte stream method that reads into a given typed array must be asynchronous. If such byte streams are then to conform to the readable stream interface, with a no-argument read() method that is an auto-allocating version of the read-into method, then read() must also be async, across all readable streams.

This is a usability upgrade for consumers, in many cases. However, for non-byte streams, it potentially costs more microtasks when multiple chunks of data would be available synchronously. We are hopeful that even if current JS engine microtask queues are not as fast as they could be, they will improve over time until this overhead is negligible (which in theory it should be).

Alongside this change, we moved the read() method from the readable stream to the reader (now called "readable stream reader" instead of "exclusive stream reader"). This drastically simplifies the stream/reader interaction, and also allows the possibility of different types of readers which have different reading behavior---again, very helpful for byte streams. The usability impact is also positive, as it guides consumers toward using piping instead of directly reading chunks from the stream.

Note that read() promises fulfill with { value, done } instead of using an EOS sentinel value. This avoids a number of problems (see extensive discussion in #253), and also provides a mechanism by which readable byte streams can smuggle out "unused" buffers given to them, by using { value: zeroLengthViewOntoBuffer, done: true }.

Finally, the state property is now removed (from readable stream), since there is no longer a "waiting" vs. "readable" distinction.

This commit also adds some new infrastructure for _templated tests_, and ports some portion of the existing tests there. This is our solution for #217 and #264.

Note that we re-merged all related code into a single readable-stream.js file, as the setup with the three separate files (readable-stream.js, exclusive-stream-reader.js, and readable-stream-abstract-ops.js) was problematic in causing circular dependencies.

- Closes #253!
- Closes #266 by simplifying reader-related stuff, removing the problematic `ready` promise, and ensuring that cancel() always returns a new promise instead of reusing [[closedPromise]].
- Closes #264 by introducing templated tests.
See discussion in #297. This commit implements the following changes:

- Allow acquiring readers for closed or errored streams; they simply act closed or errored.
- Stop auto-releasing readers when streams close/error.
- Disallow canceling a stream that is locked to a reader (you should use the reader cancel).
- Piping from a closed or errored stream will close or abort the destination stream, instead of immediately failing the pipe.
See discussion in #297. pipeTo now returns a { finished, unpipe() } object, instead of just a promise.
This reverts commit 0de787a. Per discussions in #297, we can punt on unpiping until we also figure out cancelable promises.

Conflicts:
	reference-implementation/lib/readable-stream.js
Per discussion in #297. We can always add it back later if we really want it, but it has become increasingly useless in the newer, reader-centric design.
The conservative thing to do is to not have this ability for now. Released readers are now indistinguishable from readers for closed streams.
As illustrated by the new tests, there were previously issues around auto-releasing not always giving the right result for errored or closed streams, especially with regard to allowing future readers to be acquired. This commit simplifies the logic and ensures consistency across all readers obtained from errored or closed streams.
@domenic domenic merged commit 0c6240f into master Mar 17, 2015
domenic added a commit to domenic/tcp-udp-sockets that referenced this pull request Mar 25, 2015
Readable stream transitioned from .ready + sync .read() to async .read() in whatwg/streams#296. This PR updates the examples to use this new API.

It also removes a lot of trailing whitespace. The diff might be best viewed without whitespace tracked.
domenic added a commit to domenic/tcp-udp-sockets that referenced this pull request Mar 28, 2015
Readable stream transitioned from .ready + sync .read() to async .read() in whatwg/streams#296. This PR updates the examples to use this new API.

It also removes a lot of trailing whitespace. The diff might be best viewed without whitespace tracked.
@domenic domenic deleted the async-read-take-3 branch April 20, 2015 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment